Methods for high throughput genome analysis using restriction site tagged microarrays

ABSTRACT

A method for high-throughput analysis of genomic material originating from complex biological systems, including complex microbial systems and a method of detecting changes in a genomic material using restriction site tagged (RST) microarrays and sequence passporting technique (in particular microarrays containing NotI-clones). Using the present invention method, methylation or silencing of specific alleles, homozygous and hemizygous deletions, epigenetic factors, genetic predisposition, etc, information which is particularly useful in diagnosis and treatment of cancer diseases, can be detected. The RST microarrays and passporting can also be used for qualitative and quantitative analysis of complex microbial systems.

FIELD OF THE INVENTION

The present invention pertains to a method of detecting changes in a genomic material using restriction site tagged (RST) microarrays and passporting technique, which can be used for detecting methylation or silencing of specific alleles, homozygous and hemizygous deletions, epigenetic factors, genetic predisposition, etc., information which is particularly useful in diagnosis and treatment of cancer diseases. The RST microarrays and passporting according to the present invention can also be used for qualitative and quantitative analysis of complex microbial systems.

BACKGROUND OF THE INVENTION

Genomic subtractive methods are in principle very useful for identification of disease genes, including tumour suppressor genes. However, among the many suggested techniques only a modified variant of genomic subtraction called Representational Difference Analysis (RDA, Lisitsyn et al., 1993) and RFLP (Restriction Fragment Length Polymorphism) subtraction (Rosenberg et al., 1994) have been reproducibly successful in cloning deleted sequences. Three main drawbacks have limited the wide use of these related methods: both are very complicated and laborious, they are very sensitive to minor impurities, and experiments result in cloning only a few deleted sequences. It is important to note that these methods only work well with enzymes that are not associated with CpG islands. Methylation-sensitive-representational difference analysis (MS-RDA, Ushijima et al., 1997) has more specific aims, i.e. it works with CpG islands, but it still does not avoid the limitations of the original RDA. Moreover, differentially cloned products usually do not have any connection with genes. Deletions of non-functional regions occur frequently in the human genome and cloning of such segments will not yield valuable information (Lisitsyn et al., 1995). RDA is also unable to detect differences due to point mutations, small deletions or insertions, unless they affect a particular restriction enzyme recognition site. Another source of artefacts is the PCR amplification after the first hybridization step and before the nuclease treatment. The presence of excess driver DNA can result in a reduced efficiency of amplification of the tester:tester duplexes, because the residual driver:driver and driver:tester duplexes can act as competitors. As RDA is based mainly on specific PCR amplification of desired products and uses many cycles (95-110), it suffers from a “plateau effect” that is characterised by a decline in the exponential rate of accumulation of amplification products (Innis and Gelfand, 1990). However, the major problem results from the inefficiency of the multiple restriction digestion and ligation reactions that are used in this method, which leads to the generation of false positives.

The presence of genetic alterations in tumours is now widely accepted, and explains the irreversible nature of tumours. However, observations on tissue differentiation indicated that it shares something in common with carcinogenesis, i.e. “epigenetic” changes. DNA methylation at CpG sites is now known to be precisely regulated in tissue differentiation, and is thought to play a key role in the control of gene expression in mammalian cells. The enzyme involved in this process is DNA methyltransferase, which catalyzes the transfer of a methyl group from S-adenosyl-methionine to cytosine residues to form 5-methylcytosine, a modified base that is found mostly at CpG sites in the genome. The presence of methylated CpG islands in the promoter region of genes can suppress their expression. This process may be due to the presence of 5-methylcytosine, which apparently interferes with the binding of transcription factors or other DNA-binding proteins and thereby blocks transcription. DNA methylation is connected to histone deacetylation and chromatin structure, and regulatory enzymes of DNA methylation are being cloned.

In different types of tumours, aberrant or accidental methylation of CpG islands in the promoter region has been observed for many cancer-related genes, resulting in the silencing of their expression. The genes involved include tumour suppressor genes, genes that suppress metastasis and angiogenesis, and genes that repair DNA, suggesting that epigenetics plays an important role in tumourigenesis. The potent and specific inhibitor of DNA methylation, 5-aza-2′-deoxycytidine (5-AZA-CdR), has been demonstrated to reactivate the expression of most of these malignancy-suppressing genes in human tumour cell lines. These genes may be interesting targets for chemotherapy with inhibitors of DNA methylation in patients with cancer, and may help to clarify the importance of this epigenetic mechanism in tumourigenesis. Spontaneous regression of malignant tumours has long fascinated researchers, and it has now been observed that genes inactivated by hypermethylation are frequently involved in tumours that relatively often undergo spontaneous regression. Carcinogenic mechanisms of some carcinogens seem to involve modifications of an epigenetic switch, and some dietary factors may also modify such switches.

Review articles in the literature make it clear that methylation is a basic, vital feature/mechanism in mammalian cells. It is involved in hereditary and somatic cancers, hereditary and somatic diseases, apoptosis, replication, recombination, temperature control, immune response, and mutation rate (i.e. in p53). Through methylation food can induce cancer, etc., and it is believed that methylation can be used for diagnosis, prognosis, prediction and even direct treatment of cancer. Inactivation of DNA methyltransferase is lethal for mice. Based on the growing understanding of the roles of DNA methylation, several new methodologies have been developed to make a genome-wide search for changes in DNA methylation.

There are four main genome-wide screening methods (see Sugimura T, Ushijima T, 2000) for testing methylation in the human genome: restriction landmark genomic scanning (RLGS, Costello et al., 2000), methylation-sensitive-representational difference analysis (MS-RDA), methylation-specific AP-PCR (MS-AP-PCR) and methyl-CpG binding domain column/segregation of partly melted molecules (MBD/SPM). Although each of them has its own advantages, none of them is suited for large-scale screening since all four are rather inefficient and complicated; they can be used only for testing a few samples. For example, after analysis of 1000 clones isolated using MBD/SPM, nine DNA fragments were identified as CpG islands and only one was specifically methylated in tumour DNA.

Recently developed microarrays of immobilized DNA open new possibilities in molecular biology. These DNA arrays, containing either cDNA or genomic DNA, are fabricated by high speed robotics on glass substrates. Probes that are labeled with different colors are hybridized. In one such hybridization thousands of genes or genomic DNA fragments can be analyzed, allowing massively parallel gene expression and gene discovery studies. Pilot experiments with microarrays of immobilized P1 and BAC clone DNA demonstrated that they could be used for high resolution analysis of DNA copy number variation using CGH (comparative genome hybridization). It has been suggested that this approach can work if inserts of human DNA in the cloning vectors are larger than 50 kb. In the future, when microarrays with P1 and BAC clones covering the whole human genome have been created, this approach will most likely replace conventional CGH. Clearly, construction of such microarrays with mapped P1 and BAC clones is very expensive, laborious and time consuming, and cannot be achieved in a single research laboratory. If small-insert NotI linking clones could fulfil the same function, this would open the way for a single research group to construct such microarrays for CGH analysis, and to do so for many organisms. PACs and BACs covering the whole human genome are not available yet.

Pollack et al., 1999 suggested using cDNA microarrays for detecting genomic DNA copy number changes, but the small size of cDNA clones and the high ratio of background hybridization to real signal make this suggestion problematic.

In the fall of 2000 Affymetrix began selling the GeneChip HuSNP Mapping Assay. These microarrays contain 1.494 SNP loci. In the promotional papers it was shown that these microarrays can be used for the detection of loss of heterozygosity (LOH). However, 13% of SNPs failed in the majority of samples, and only 354 SNPs were informative in one particular experiment.

Lucito et al. (2000) used a modification of RDA technology for detecting copy number fluctuations in tumour cells. In this method BglII representations were used in conjunction with DNA microarrays. As there are many small BglII clones in the human genome (150.000), it will not be easy or cheap to make comprehensive microarrays with unique clones covering the whole human genome.

Presently, there are some methods available to analyze complex microbial mixtures, e.g. enzyme analysis (Katouli et al., 1994), which requires growth of colonies outside the body, or analysis of the composition of fatty acids in stools, which gives crude indications of the composition of the normal flora (refs.); however, all of them have obvious limitations.

The application of culture-independent techniques based on molecular biology methods can overcome some shortcomings of conventional cultivation methods. In recent years the approaches based on PCR amplification of 16S rRNA genes have been most popular. One modification of the approach utilized fingerprinting of all the species in the gut using, for instance, denaturing gradient gel electrophoresis (DGGE) with PCR amplified fragments of 16S rRNA genes. In another application, PCR amplified fragments of 16S rRNA genes were directly cloned and sequenced. These studies yielded important information; however, an intrinsic disadvantage of the approach limits its application. The problem is that 16S rRNA genes are highly conserved and therefore the same sequenced fragment can belong to different species. It is also important to keep in mind that in fingerprinting experiments similar fragments can represent different species, and different fragments can represent the same species.

SUMMARY OF THE INVENTION

In view of the drawbacks associated with the prior art methods for analysis of genomic material originating from complex biological systems, there is a need for uncomplicated, quick and reliable genome analysis methods.

Therefore, the object of the present invention is to provide novel and unique techniques for analysis of genomic material originating from complex biological systems, including complex microbial systems. The main objects of the present invention are the following:

One object of the present invention is to prepare and to use NotI-clone (in general PCR fragments, oligonucleotides, etc.) microarrays for studying methylation and/or copy number changes in eukaryotic genomes for diagnosis, prognosis, and identification of cancer causing genes. NotI microarrays are the only existing microarrays giving the opportunity to detect copy number changes and methylation simultaneously. This includes comparison of normal and malignant cells at the genomic and/or RNA level; comparison of primary tumours and metastases; analysis of families suffering from hereditary diseases including cancers; and diagnostics and disease prediction.

The capability to establish differences between normal and tumour cells is instrumental for cloning cancer causing genes and for early diagnosis and prevention of cancer. It is also very important for differentiation, development and evolution studies.

Another object of the present invention is to provide techniques allowing qualitative and quantitative analysis of complex microbial systems, such as the normal flora of the gut.

A further object of the present invention is to prepare NotI sequencing passports (“NotI passport”) (collection of NotI tags: short sequences surrounding genomic NotI sites) and to use them to study the same problems as were mentioned above for NotI microarrays.
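The notion of a NotI tag as a short sequence surrounding a genomic NotI site can be illustrated with a minimal sketch. It assumes a single-stranded text representation of the DNA; the function name `noti_tags`, the toy sequence and the flank length are illustrative choices, not part of the described method.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence


def noti_tags(sequence, flank=19):
    """Return (position, left_flank, right_flank) for every NotI site.

    `flank` is an illustrative tag length; any value can be passed.
    """
    seq = sequence.upper()
    tags = []
    start = seq.find(NOTI_SITE)
    while start != -1:
        end = start + len(NOTI_SITE)
        left = seq[max(0, start - flank):start]   # bases upstream of the site
        right = seq[end:end + flank]              # bases downstream of the site
        tags.append((start, left, right))
        start = seq.find(NOTI_SITE, start + 1)
    return tags


# Toy sequence (hypothetical) with one NotI site and one BamHI site.
toy = "ATGCATGCAT" + NOTI_SITE + "TTGACCGGTA" + "GGATCC" + "ACGT"
for pos, left, right in noti_tags(toy, flank=10):
    print(pos, left, right)
```

A collection of such tags from one genome or sample would then correspond, in this simplified picture, to one passport.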

Wide screening of genomic material using RST encounters many problems, e.g. the size of the human genome/microbial mixture and the number of repeat sequences. We have solved these problems by developing a new method for labeling genomic DNA, where only sequences surrounding NotI (or any other restriction) sites are labeled (tagged), herein called NotI Representation (NR).

In the present invention, Restriction Site Tags (RSTs) are generated from thousands of microorganisms or human genomes and used for the generation of NotI RST microarray passports, which uniquely describe not only individual human cells/organisms or bacterial strains but most or all of the members of a microbial flora, e.g. in the gut.

With the NotI or RST genome scanning method according to the present invention, large scale scanning of microbial genomes on a quantitative and qualitative basis is possible.

From the results of our experiments, we have shown that it is possible to create a large database containing NotI microarray passports, i.e. NotI microarray images. Many samples of colon flora have been compared to determine their exact composition.

The present invention procedure is universal, i.e. we can use any other enzyme for creating “RST microarray passports”. Moreover, any biochemical or chemical approach that cuts DNA (RNA) at a specific position sparsely distributed along the DNA (RNA) can be used. For example, it can be an enzyme like cre-recombinase or a chemically modified oligonucleotide that forms triplex DNA and initiates a DNA break. The polymorphism of NotI representations can be increased by using several enzymes in addition to BamHI, e.g. BclI, BglII, HindIII etc. In pilot experiments we have produced NotI microarrays from gram-positive and gram-negative bacteria and have shown that even very similar E. coli strains can be easily discriminated using this technique. Using the above mentioned technique we can identify important pathogenic bacteria in the human organism.

These ‘NotI microarray passports’ can be produced for individuals, normal/tumour pairs, different cell NotI Representation (NR). A pilot experiment using NR probes demonstrated the power of the method, and we successfully detected Chr.3 NotI clones deleted in ACC-LC5 and MCH939.2 cell lines.

Such NotI RST microarrays can be prepared for any human or any group of humans who, for example, suffer from the same specific disease, in order to detect a certain disease which cannot be detected by other means. NotI RST microarrays can also be prepared for any mammal (like cattle or dogs) or microbial organism.

NotI arrays will speed up cancer research very significantly and can replace CGH, LOH and many cytogenetic studies.

The NotI scanning approach will find mainly deleted, amplified, or methylated genes but it will also identify polymorphic and mutated NotI sites. Comparing these NotI passports can give a clue to understanding many diseases and other fundamental biological processes.

Using the present invention method of producing RST microarrays, restriction enzyme tagged (RST) microarrays for any enzyme can be created. The microarrays according to the present invention represent a novel type of microarrays, which is completely different from the existing ones (oligonucleotides, cDNA, genomic BAC/PAC clones).

The ability to establish differences between individual compositions of the normal gut flora will be instrumental for future analysis of how the normal flora composition is influenced by diet, special foods, geographical location, colon, ovarian and other cancers, and other diseases. It has particularly wide applications for cancer research.

The present invention method will probably have a strong impact both on basic science and on human and animal health, agriculture, medicine, pharmacology, etc.

We propose to use our NotI clones as a complement to microarrays based on P1 and BAC clones covering the whole human genome. Microarrays based on small-insert NotI linking clones have been developed, and can have a similar function. Approximately 10.000-20.000 NotI clones, covering the whole human genome and containing 10%-20% of all genes (40%-50% of them are not present in EST microarrays), are already available.

In order to achieve what is described above, the present invention comprises the following embodiments:

One embodiment of the present invention provides a method for preparing nucleic acid and/or modified nucleic acid reference material bound to a solid phase, comprising the steps of

-   digesting nucleic acid and/or modified nucleic acid reference material using biochemical and/or chemical approaches, to obtain sequence fragments surrounding a specific recognition site,
-   selecting said nucleic acid and/or modified nucleic acid sequence fragments associated with a specific recognition site.

Said reference material is digested by a first restriction enzyme and/or one or more second restriction enzymes, e.g. endonucleases, such as cre-recombinase.

In one embodiment of the present invention the recognition sites of the first endonuclease are sparsely distributed along said genomic material and are located adjacent to gene sequences, and the recognition sites of said one or more second restriction endonucleases occur more frequently along said genomic material than the sites of the first endonuclease.

In another embodiment of the present invention the digestions by the first and second restriction endonucleases are performed simultaneously, and different linkers are ligated to the ends resulting from cutting by the first and second restriction endonucleases, respectively, which linkers are designed such that when primers are added in order to perform PCR reactions, only the fragments containing ends resulting from cutting by the first restriction endonuclease will be amplified.
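The selection principle of this embodiment can be sketched in silico. The example below is a simplified illustration, not the laboratory procedure: it treats DNA as a single-stranded string, ignores overhangs, linkers and PCR efficiency, and simply keeps those fragments of a simultaneous double digest that carry at least one end produced by the first enzyme. NotI and BamHI are used as the first and second enzymes; the function names and the toy sequence are assumptions made for illustration.

```python
import re

# Recognition sequence and top-strand cut offset for each enzyme.
ENZYMES = {
    "NotI": ("GCGGCCGC", 2),   # GC^GGCCGC
    "BamHI": ("GGATCC", 1),    # G^GATCC
}


def cut_positions(seq, enzyme):
    """Return the cut coordinates of `enzyme` in `seq`."""
    site, offset = ENZYMES[enzyme]
    # A lookahead pattern also finds overlapping occurrences.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def noti_representation(seq, first="NotI", second="BamHI"):
    """Keep double-digest fragments that have at least one `first`-enzyme end."""
    seq = seq.upper()
    cuts = {pos: first for pos in cut_positions(seq, first)}
    for pos in cut_positions(seq, second):
        cuts.setdefault(pos, second)
    # The two termini of the input molecule are labelled None.
    boundaries = [(0, None)] + sorted(cuts.items()) + [(len(seq), None)]
    kept = []
    for (a, end_a), (b, end_b) in zip(boundaries, boundaries[1:]):
        if first in (end_a, end_b):
            kept.append(seq[a:b])
    return kept


# Hypothetical toy molecule: BamHI site, NotI site, BamHI site.
toy = "AAAA" + "GGATCC" + "TTTT" + "GCGGCCGC" + "CCCC" + "GGATCC" + "AAAA"
print(noti_representation(toy))
```

In this toy case only the two fragments flanking the NotI site are retained; fragments bounded solely by BamHI ends or by the ends of the molecule are discarded, mirroring the idea that they would not be amplified.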

In still another embodiment of the present invention the reference material is first digested by the one or more second restriction endonucleases, the ends of the thus obtained fragments are self-ligated into the form of circular nucleic acid and/or modified nucleic acid molecules, and any linear fragments remaining after self-ligation are inactivated before digestion with the first restriction endonuclease, whereby the linear fragments resulting from the digestion by the first endonuclease are subjected to PCR amplification.

In these embodiments the first restriction endonuclease is NotI, or any other restriction endonuclease, the restriction sites of which occur in proximity to CpG islands in the genomic material.

The first restriction endonuclease can also be NotI, PmeI or SbfI, or a combination of two or more of said endonucleases, and the second endonuclease can be BamHI, BclI, BglII or Sau3A, or a combination of two or more of said endonucleases.

Said nucleic acid and/or modified nucleic acid reference material can be selected from RNA, DNA, peptides or modified oligonucleotides, or a combination of two or more of said materials.

In the present invention nucleic acid and/or modified nucleic acid is bound to a solid glass support in the form of a microarray. However, the present invention is not limited to using glass microarrays. Solid phases such as filters, e.g. nylon filters, coded beads, cellulose, such as nitrocellulose, or other solid supports can also be used to bind nucleic acid and/or modified nucleic acid. In general DNA, oligonucleotides, etc. bound to a solid phase can be used.

The genomic material that can be used according to the present invention can be derived from one or more humans, from different locations in the body/bodies and at the same or different points in time. Said genomic material can be derived from bacteria from the gut, skin or other parts of the human body. However, it can also be derived from any organism, bacteria, animal, or plant, or product produced therefrom, or from any substance wherein genomic material can be contained, especially air and water.

The present invention also pertains to the fragments that can be obtained using the present invention, and the nucleic acid and/or modified nucleic acid microarrays containing these fragments.

The present invention further pertains to representations of the genome, or of a part thereof, of an organism, comprising multiple copies of the nucleic acid and/or modified nucleic acid fragments, or a selection thereof, obtained by means of the present invention method.

These representations, in liquid form, are hybridized to the nucleic acid and/or modified nucleic acid fragments present in the form of said solid phases.

Said representations can be used for discriminating between different genomes, detecting methylations, deletions, mutations and other changes within genomic material obtained from the same individual at different points of time, or in the genomic material obtained from one individual as compared to a standard representation obtained from at least one other individual, or a combination thereof.

In addition to the above-mentioned applications, these representations can be used for:

-   studying methylation and copy number changes in eukaryotic genomes for diagnosis, prognosis, identification of cancer causing genes, etc.,
-   genotyping different microorganisms (viruses, prokaryotic, eukaryotic),
-   studying biocomplexity and diversity of complex biological systems, i.e. human gut, bacterial flora in water, food, air resources,
-   identifying pathogenic organisms in different sources including complex biological mixtures,
-   producing passports (images of microarray hybridizations, databases containing tag sequences) for different purposes: to describe organisms at different conditions, i.e. different ages, disease/healthy, infected/uninfected etc.,
-   identifying new organisms, e.g. bacterial species,
-   producing microarrays (DNA- and oligo-based) to study all above described features,
-   verification and maintenance of large biological collections/banks, i.e. verifying cell lines and individual organisms for higher organisms and confirming the purity of the particular strain for microbial species,
-   producing kits for labeling and hybridization with microarrays,
-   producing kits for making sequence tagging (passporting), and
-   producing oligo microarrays to analyze sequence tags.

Finally, the present invention also pertains to a NotI-CODE genomic subtraction method based on the use of the above described fragments.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. General scheme for the NotI-CODE subtractive procedure.

FIG. 2. Southern hybridization of NotI clones showing differential hybridization. Clone names are shown at the bottom. N: normal DNA; L: DNA isolated from lung cancer cell line ACC-LC5.

FIG. 3. General principle of using NR for NotI microarrays.

FIG. 4. NotI microarray profiling of deletions/methylation in microcell hybrid MCH 939.2 (A), cell line ACC-LC5 (B), and primary RCC tumors #196 (C) and #301 (D). Representative images of microarrays (1) are ordered according to the physical map of chromosome 3. One-dimensional clustering (2) is based on average normalized red/green ratios of fluorescent data (red, R>3; green, R<0.3). For (A) and (B) normal and tested DNA were hybridized together. NR for MCH903.1 (the whole chromosome) was labeled red and NR for MCH939.2 (3p.14-p22 deletion) was labeled green. Similarly, NR for normal lymphocyte DNA was labeled red and NR for small cell lung cancer line ACC-LC5 was labeled green. The red clusters demonstrate a significant overrepresentation of complete chromosome 3 or normal DNA. The green clusters demonstrate underrepresentation of normal DNA. For (C) and (D) one step of the NotI-CODE subtraction procedure was performed and single color hybridization was done. The green clusters demonstrate the significant overrepresentation of normal DNA. Grey color marks controls.

FIG. 5. General scheme of the experiment. (microbial flora)

FIG. 6. Flow chart diagram explaining generation of an 85 bp oligonucleotide containing information about a 19 bp NotI-tag.

DETAILED DESCRIPTION OF THE INVENTION

In the literature it has been suggested and demonstrated that NotI sites are practically exclusively located in CpG islands and are closely associated with functional genes. Thus NotI sites are very useful markers not only for physical but also for genetic mapping.

The present inventors have created high-density grids that contain 50.000 NotI clones originating from 6 representative NotI linking libraries and generated more than 22.000 unique NotI sequences (with stringent criteria 16.000) containing 17 Mb of information. Analysis of these sequences demonstrated that even short sequences surrounding NotI sites are a source of important information allowing efficient isolation of new genes and the study of carcinogenesis.

We have developed a new approach for constructing NotI linking libraries (Zabarovsky et al., 1990) that makes it possible to generate representative NotI linking libraries both in lambda phage and in plasmid form (Zabarovsky et al., 1994a). Since the procedure is quite easy and reproducible, it is possible to construct libraries from many sources.

Using the present invention NotI (RST) microarrays, based on the short sequences surrounding NotI sites or in general on restriction site tagged sequences (RSTS), complex biological systems, including complex microbial mixtures, can be qualitatively and quantitatively analysed.

In the present invention study NotI microarrays for human chr. 3 (150 clones) were established and employed to compare chr. 3 in renal, lung, breast and nasopharyngeal cancers.

NotI Microarrays for Genome Wide Scanning

Recently we have sequenced 25.000 NotI clones and identified among them 16.000 unique clones. These clones, which cover the whole human genome and contain 10%-20% of all genes (40%-50% of them are not present in EST microarrays), are already available.

The NotI microarrays can be used for testing tumour genomic DNA in genome wide NotI scanning (e.g. for deletion/amplification studies). Such arrays will speed up cancer research very significantly and can replace LOH (loss of heterozygosity), CGH (comparative genome hybridization), and other cytogenetic studies.

The fundamental problems for genome wide screening using NotI clones are:

-   (i) the size and complexity of the human genome;
-   (ii) the number of repeat sequences; and
-   (iii) the comparatively small size of the inserts in NotI clones (on average 6-8 kb).

To solve these problems, special primers were designed and a special procedure was developed to amplify only the regions surrounding NotI sites, the so-called NotI representation (NR). Other DNA fragments are not amplified. We suggest using NotI microarrays for genome screening in combination with this new method for labeling genomic DNA, where only sequences surrounding NotI sites are labeled.

NotI microarray images can be generated for particular cells, tumours, and individuals. By comparing images from normal and tumour cells, the differences between them can be defined. Using this information, NotI linking clones that differ between two (or more) DNAs can be identified. These clones can be used for further analysis and for isolating complete genes. Polymorphism in NotI sites is very frequent and according to the literature 43.5% of NotI sites are differently methylated or polymorphic.
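Such a comparison can be sketched as a simple thresholding of normalized red/green ratios per clone. The thresholds below (R>3 scored red, R<0.3 scored green) are the ones given in the description of FIG. 4; the clone names, ratio values and function names are hypothetical and serve only as illustration.

```python
def classify_ratio(ratio, high=3.0, low=0.3):
    """Score one normalized red/green ratio using the FIG. 4 thresholds."""
    if ratio > high:
        return "red"
    if ratio < low:
        return "green"
    return "unchanged"


def differing_clones(ratios, high=3.0, low=0.3):
    """Return only the clones whose ratio falls outside the [low, high] band."""
    calls = {}
    for clone, ratio in ratios.items():
        call = classify_ratio(ratio, high, low)
        if call != "unchanged":
            calls[clone] = call
    return calls


# Hypothetical ratios for three clones on one array.
example = {"clone_A": 4.2, "clone_B": 1.0, "clone_C": 0.1}
print(differing_clones(example))
```

Which biological event a red or green call corresponds to depends on which sample carries which label in the particular experiment.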

Analysis of our database of 16.000 unique NotI sequences (two sequences can belong to the same NotI clone) showed that practically all of them are connected with genes and located at the 5′ end of the genes. Comparison with the completely sequenced chr. 21 and 22 revealed interesting observations. Chr. 21 contains 122 NotI sites (methylated and unmethylated), and Ichikawa et al., 1993 cloned 40 NotI sites to construct the complete NotI restriction map with 43 NotI fragments. Of these 40 clones our database contained 38 (95%), together with 13 additional NotI clones (11%). Therefore, using random sequencing we could isolate 27.5% more NotI clones than in the study of Ichikawa et al., 1993, where the efforts were focused on cloning NotI clones only from chr. 21. Altogether, of 390 possible NotI sites in chr. 21 and 22 our database contains 163 (42%) clones. Moreover, 18 clones that were identified in our work (5%) were not present in public sequences. These clones contained polymorphic NotI sites. Thus, from our data we can conclude that unmethylated NotI sites (our database contains only unmethylated NotI sites) represent approximately 42%, and polymorphic sites 5%, of all possible NotI sites. Our estimate is that the human genome contains 15.000-20.000 NotI sites and that 6.000-9.000 of them are unmethylated in a particular cell. Thus screening with NotI microarrays will be equivalent to screening using 6.000-9.000 gene associated single nucleotide polymorphisms (SNP).
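The percentages quoted in this paragraph can be reproduced with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the reading of the 11% figure as 13 clones out of the 122 chr. 21 sites, and the application of the 42% fraction to the genome-wide estimate, are assumptions made here for the check.

```python
def pct(part, whole):
    """Percentage of `part` relative to `whole`."""
    return 100.0 * part / whole


ichikawa_clones = 40      # NotI sites cloned by Ichikawa et al., 1993
shared_clones = 38        # of those, present in the database
additional_clones = 13    # chr. 21 clones found only in the database
chr21_sites = 122         # all NotI sites on chr. 21
sites_chr21_22 = 390      # possible NotI sites on chr. 21 and 22
database_clones = 163     # database clones on chr. 21 and 22
polymorphic_clones = 18   # clones absent from public sequences

database_chr21 = shared_clones + additional_clones  # 51 clones

print(round(pct(shared_clones, ichikawa_clones)))         # 95
print(round(pct(additional_clones, chr21_sites)))         # 11 (assumed denominator)
print(pct(database_chr21 - ichikawa_clones, ichikawa_clones))  # 27.5
print(round(pct(database_clones, sites_chr21_22)))        # 42
print(round(pct(polymorphic_clones, sites_chr21_22)))     # 5

# Applying the 42% unmethylated fraction to 15.000-20.000 sites
# (points used as thousands separators in the text).
low_estimate = 0.42 * 15000
high_estimate = 0.42 * 20000
print(round(low_estimate), round(high_estimate))          # 6300 8400
```

Under these assumptions the extrapolated values fall within the 6.000-9.000 range stated above.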

Comparing the prior art genomic chips with the present invention NotI microarrays, it is easy to see that NotI microarrays give additional information to the deletion mapping: they can be used for gene expression profiling and methylation studies (see Table 1).

For preparing the probe for the SNP chip, 3.000 PCR primers and 24 separate reactions are needed, whereas the probe for NotI microarrays is prepared using 1-2 primers in one reaction tube. Using the same NotI clones we are able to simultaneously obtain information about:

-   (i) deletions/amplifications;
-   (ii) methylation;
-   (iii) gene expression profiles.

All these features of NotI microarrays are extremely important for large scale experiments.

The pattern of hybridization of NR to the NotI microarrays represents a microarray passport for the DNA used for preparing NR.

We will now summarize the differences between CpG island microarrays (below abbreviated to CGI, see Yan et al., Cancer Res. (2001) 61: 8375-8380), which we presently find is the closest prior art, and the present invention RST microarrays (below abbreviated to RST, see Table 2).

In the present invention sequences surrounding the same restriction site are cloned, whereas in CGI sequences originate from sequences between two restriction sites.

In principle, using the present invention technique, any restriction enzyme can be used for RST, but only a limited number for CGI.

CGI can detect methylation, but not (in general) deletions (hemi- or homozygous) or amplifications of unmethylated sequences. RST can detect both copy number changes and methylation. CGI can detect deletion of an allele if it is methylated in normal genomic material and deleted (unmethylated) in tumour material. This process is however inefficient, as the vast majority of the important genes are unmethylated in normal genomic material, and the majority of methylated sequences in normal genomic material are various kinds of repetitive elements, e.g. LINE, Long Interspersed Element (or sequence or repeat).

In CGI the total human DNA is labeled, in RST only 0.1-0.5%, and this DNA contains 10-fold fewer repeats than the total human DNA.

Many clones in CGI contain repeats and ribosomal DNA, whereas RST only comprises genes containing unique human sequences. This very important difference is the result of completely different techniques of constructing microarrays (CGI construction uses a methyl-CG binding column, which is not used in the present invention).

For RST microarrays short oligonucleotides (20-100 bp) can be used, which is not possible for CGI.

Incomplete digestion does not create problems for RST, but produces artificial signals in CGI.

Using RST, hybridization is obtained when the site is not methylated, whereas in CGI hybridization only occurs if it is methylated.

CGI microarrays can only be used to study methylation in higher vertebrates. This can also be done with RST, which in addition can be used for genotyping (passporting) any organism. This means that RST microarrays, but not CGI microarrays, can be used to genotype bacteria and viruses, for example.

Our RST application contains complementary aspects, i.e. the generation of NotI (RST) tags (passports) by sequencing. Sequencing can be done using different techniques, including sequencing by hybridization to microarrays. No such complementary approach is possible with CGI.

NotI-CODE (or RST-CODE in general) can be used together with RST microarrays to remove contaminating sequences in one step. No such technique can be applied for CGI. Existing subtractive procedures like RDA cannot be employed, since they are not efficient enough to deal with the high complexity of total human genomic DNA.

Using RST microarrays it is possible to discriminate between deleted/amplified and methylated sequences. To achieve this aim NR should be produced using DNA that is unmethylated (this can be done by different approaches: limited PCR amplification after first digestion with restriction enzyme(s), enzymatic demethylation, etc.).

NotI Passporting

We originally planned to use the SAGE technique for this purpose. Serial analysis of gene expression (SAGE) allows for both a representative and comprehensive differential gene expression profile (Velculescu et al., 1995). The idea of the approach is that for each mRNA molecule a short 9-bp sequence tag is produced (13 bp including the recognition site for the tagging enzyme). These tags are then ligated into concatemers and cloned. One sequencing reaction produces information for tens of RNA molecules. Thus by sequencing a few thousand clones one can e.g. evaluate all of the estimated 10.000 to 50.000 expressed genes in a given cell population. We tried the SAGE technique for producing NotI tags, but this was unsuccessful. The complexity of genomic DNA in microbial mixtures is at least 100 times greater than the complexity of mRNA in eukaryotic cells. All RNA molecules must be tagged in SAGE, but in our case approximately one out of 250 molecules should be tagged. We propose to produce one tag for each 100-1.000 kb, but in SAGE one tag is produced for every 256 bp. At the same time, a 13 bp tag is not enough for unambiguous identification of sequences in genomic DNA. That is why we have developed a new procedure called NotI passporting.
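The tag-length argument above can be checked with simple arithmetic. The following Python sketch assumes random DNA with equal base frequencies, which is an idealization and not a result of the experiments described here; the function names are illustrative only.

```python
# Idealized arithmetic for random DNA with equal base frequencies (an assumption).


def expected_site_spacing(site_length: int) -> int:
    """Average distance in bp between occurrences of one fixed site."""
    return 4 ** site_length


def distinct_tags(tag_length: int) -> int:
    """Number of different sequences that a tag of this length can take."""
    return 4 ** tag_length


if __name__ == "__main__":
    print(expected_site_spacing(4))  # 4-bp site: 256 bp, the SAGE figure quoted above
    print(expected_site_spacing(8))  # 8-bp site such as NotI: 65536 bp
    print(distinct_tags(13))         # 13-bp tag: 67108864 possible sequences
```

Under this assumption a 13 bp tag offers about 6.7 x 10^7 possible sequences, which is fewer than the number of positions in a genome of several billion base pairs, so such tags cannot all be unambiguous.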

In this work we used the following modification. Genomic DNA was digested with NotI and ligated to a linker with NotI sticky ends. This linker contained BpmI recognition sites. This restriction nuclease cuts 16/14 bp outside of its recognition site. The ligation mixture was digested with this enzyme to generate 11/9 nucleotide tags adjacent to the NotI site. This DNA sample was ligated to the ZNBpm linker and PCR amplified with the antiuniver and Z1univer primers to generate an 85 bp duplex. The final PCR amplified molecule contains a 17 bp sequence tag which is missing 2 bp from the original NotI site, and therefore the whole NotI tag contains 19 bp. NotI passports were experimentally produced for E. coli K12, E. cloacae R4 and K. pneumoniae B4958. Experiments with samples obtained from mice demonstrated that the quality of DNA isolated from intestine or feces was sufficient to obtain NotI tags. The NotI passports uniquely identified these species, and among 96 tags none was common to these 3 bacterial species. Of course, ditags or concatemers can also be created from these 85 bp products. We believe that new high-throughput technologies like MPSS will make sequencing of single tags a more efficient approach than creation of concatemers. However, the design of the experiments can differ between laboratories. As mentioned above, this restriction site tagging procedure can be adapted to any restriction nuclease recognition site. For comprehensive analysis of flora composition, use of several passports will be advantageous, since different bacteria possess very different CG content. This means that with NotI passports bacteria having high CG content (NotI recognition site: GCGGCCGC) will predominantly be represented, whereas using, for example, SwaI passports (SwaI: ATTTAAAT), bacterial genomes with high AT content will be analyzed more carefully. Use of 2-3 different passports can significantly increase the sensitivity of the analysis and also be favourable for different applications, e.g. cancer risk, medication, diet, etc.
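For a sequenced genome, the tags flanking each NotI site can also be predicted in silico. The following Python sketch is only an illustration under stated assumptions: the sequence is treated as linear, the tag length `flank` is a hypothetical parameter rather than the exact tag geometry produced by BpmI, and the function name `noti_passport` is not part of the described procedure.

```python
NOTI_SITE = "GCGGCCGC"  # NotI recognition site as given in the text


def noti_passport(sequence: str, flank: int = 11) -> list[tuple[str, str]]:
    """Return (left_tag, right_tag) for every NotI site in a linear sequence.

    flank is a hypothetical tag length; tags are truncated at the sequence ends.
    """
    seq = sequence.upper()
    passport = []
    start = seq.find(NOTI_SITE)
    while start != -1:
        end = start + len(NOTI_SITE)
        left = seq[max(0, start - flank):start]
        right = seq[end:end + flank]
        passport.append((left, right))
        # Step by one base so that overlapping sites are also reported.
        start = seq.find(NOTI_SITE, start + 1)
    return passport


if __name__ == "__main__":
    demo = "AAAACCCC" + NOTI_SITE + "TTTTGGGG"
    print(noti_passport(demo, flank=4))  # [('CCCC', 'TTTT')]
```

Each site yields two tags, one on each side, which is the property used later as an internal control.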

We tested the potential of the passporting approach by analyzing 25 bacterial species that have been completely sequenced. The numbers of recognition sites for rare cutting restriction enzymes in these bacterial species are given in Table 3 below. It is easy to see that all 25 microbial species have different numbers of NotI recognition sites and therefore can be distinguished by NotI passporting. Moreover, from Table 3 we can see that the PmeI and SbfI restriction enzymes were even more informative.

Table 4 shows the results of comparisons of different strains of E. coli and Helicobacter pylori for the NotI, PmeI and SbfI enzymes. All of these strains were uniquely described by any of these enzymes, and thus the inventive method can discriminate between different species and strains, which was not possible with 16S rRNA gene sequencing.
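Counting recognition sites in a sequenced genome is straightforward. The Python sketch below is illustrative and is not the software used to produce Table 3; the NotI and SwaI sites are those quoted in the text, while the PmeI (GTTTAAAC) and SbfI (CCTGCAGG) sites are supplied here from general knowledge of these enzymes. The sequence is treated as linear, so a site spanning the junction of a circular chromosome would be missed.

```python
RARE_CUTTERS = {
    "NotI": "GCGGCCGC",
    "PmeI": "GTTTAAAC",
    "SbfI": "CCTGCAGG",
    "SwaI": "ATTTAAAT",
}


def count_sites(sequence: str, site: str) -> int:
    """Count occurrences of site, including overlapping ones, in a linear sequence."""
    seq = sequence.upper()
    count = 0
    pos = seq.find(site)
    while pos != -1:
        count += 1
        pos = seq.find(site, pos + 1)
    return count


def site_profile(sequence: str) -> dict[str, int]:
    """Number of sites per enzyme; all four sites are palindromic, so one strand suffices."""
    return {name: count_sites(sequence, site) for name, site in RARE_CUTTERS.items()}


if __name__ == "__main__":
    print(site_profile("ATTTAAATGTTTAAAC"))
    # {'NotI': 0, 'PmeI': 1, 'SbfI': 0, 'SwaI': 1}
```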

All sequenced E. coli strains contained altogether 1 312 tags (including the tags to the left and to the right of the NotI recognition site) for these 3 enzymes, and among them only 139 were not unique. We can take into account that two tags describe the same NotI site: one tag can be shared while the other is different, in which case both tags still represent a unique NotI site. On this basis only 82 tags were not unique. These results demonstrate the power of the approach.
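The two levels of uniqueness described above (per tag and per site) can be expressed as follows. This is a minimal Python sketch with hypothetical strain names and toy tags; it is not the analysis that produced the figures of 139 and 82 tags.

```python
from collections import Counter

SitePair = tuple[str, str]  # (left_tag, right_tag) of one restriction site


def non_unique_tags(passports: dict[str, list[SitePair]]) -> set[str]:
    """Tags that occur more than once over all strains."""
    counts = Counter(
        tag for sites in passports.values() for pair in sites for tag in pair
    )
    return {tag for tag, n in counts.items() if n > 1}


def ambiguous_sites(passports: dict[str, list[SitePair]]) -> list[tuple[str, SitePair]]:
    """Sites that stay ambiguous because both of their tags are shared."""
    shared = non_unique_tags(passports)
    return [
        (strain, pair)
        for strain, sites in passports.items()
        for pair in sites
        if pair[0] in shared and pair[1] in shared
    ]


if __name__ == "__main__":
    toy = {
        "strain_A": [("AAA", "CCC"), ("GGG", "TTT")],
        "strain_B": [("AAA", "ACG"), ("GGG", "TTT")],
    }
    print(sorted(non_unique_tags(toy)))  # ['AAA', 'GGG', 'TTT']
    print(len(ambiguous_sites(toy)))     # 2
```

In the toy data the tag AAA is shared, but each site carrying it has a distinct partner tag, so only the GGG/TTT sites remain ambiguous.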

In our comparative experiments we used not only bacterial genome sequences but also whole human genome sequences (including EST and EMBL entries). In such experiments, in the majority of cases, NotI tags were unique even with the allowance of 1-2 sequence mismatches.
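Uniqueness with an allowance for mismatches can be tested by comparing tags position by position. The Python sketch below assumes tags of equal length and counts substitutions only (Hamming distance); whether insertions or deletions were also considered in the comparison above is not stated, and the function names are illustrative.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("tags must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_unique(tag: str, others: list[str], max_mismatches: int = 2) -> bool:
    """True if no other tag lies within max_mismatches substitutions of tag."""
    return all(hamming(tag, other) > max_mismatches for other in others)


if __name__ == "__main__":
    print(is_unique("ACGTACGT", ["ACGTACGA"], max_mismatches=1))  # False
    print(is_unique("ACGTACGT", ["TTTTTTTT"], max_mismatches=2))  # True
```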

As mentioned above, a strongly advantageous feature of NotI passporting is the internal control. If a NotI site from a particular bacterial species yields, for example, NotI tag 100 and NotI tag 101, then both tags should be obtained in approximately the same quantities. If only NotI tag 100 is present, then it most probably means that NotI tag 100 originates from another bacterial species.
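The internal control can be applied automatically to tag counts. In the Python sketch below the threshold `max_fold` (the largest accepted ratio between the counts of the two tags of one site) is a hypothetical parameter chosen for illustration; the text only requires "approximately the same quantities".

```python
def check_site_pairs(
    tag_counts: dict[str, int],
    site_pairs: list[tuple[str, str]],
    max_fold: float = 3.0,
) -> dict[tuple[str, str], str]:
    """Classify each site by how well the counts of its two tags agree."""
    report = {}
    for left, right in site_pairs:
        a = tag_counts.get(left, 0)
        b = tag_counts.get(right, 0)
        if a == 0 and b == 0:
            status = "absent"
        elif a == 0 or b == 0:
            # Only one tag seen: it probably comes from another species.
            status = "orphan"
        elif max(a, b) / min(a, b) <= max_fold:
            status = "consistent"
        else:
            status = "unbalanced"
        report[(left, right)] = status
    return report


if __name__ == "__main__":
    counts = {"tag100": 50, "tag101": 45, "tag200": 30}
    pairs = [("tag100", "tag101"), ("tag200", "tag201")]
    print(check_site_pairs(counts, pairs))
    # {('tag100', 'tag101'): 'consistent', ('tag200', 'tag201'): 'orphan'}
```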

The CODE procedure mentioned above can efficiently be applied to the NotI flanking sequences (Li et al., Proc. Natl. Acad. Sci. USA, (2002) in press). Thus, the power and sensitivity of the passporting procedure can be significantly increased by removing the most abundant species with the CODE technique (Li et al., 2001).

The ability to analyze complex microbial mixtures can be important for many applications. For instance, differences in the composition of the normal flora between individuals will be instrumental for future analysis of how the normal flora composition is affected by diet, special foods, geographical location, colon diseases, autoimmunity, bacterial effects on colonic cancer risk, medication such as antibiotics, and the development of probiotics.

For this analysis we suggest using generated restriction site tagged sequences. Hundreds of thousands of tags can be produced in a short time, allowing careful analysis of thousands of bacterial species/strains (Velculescu et al., 1995). We have demonstrated that such NotI tags can be efficiently produced and that such tags have high specificity. The power of the method can be increased using the CODE subtractive procedure. We also provide a database for ‘NotI passports’ (as mentioned above, it is more correct to speak about ‘RSTS passports’). Such a database can be used together with a NotI (RST) microarray database (Li et al., Proc. Natl. Acad. Sci. USA, (2002) in press), as these approaches are mutually complementary. This integrated database generates new knowledge, as these two approaches are based on completely different biochemical techniques but aim to solve the same problem.

NotI—CODE Subtraction

Prior to the present invention, the inventors developed a new genomic subtraction procedure called CODE, Cloning Of Deleted Sequences (Li et al., Biotechniques, (2001), 31: 788-793), that does not suffer from some of the limitations of RDA and RFLP subtraction. CODE is based on a modification of the COP procedure (Li, J., Wang, F., Zabarovska, V., Wahlestedt, C., Zabarovsky, E. R., 2000, Cloning of polymorphisms (COP): enrichment of polymorphic sequences from complex genomes. Nucleic Acids Res.), which is a new procedure for cloning single nucleotide polymorphisms. Our major objectives were to develop a simple and reproducible procedure, and to improve subtractive enrichment, thereby avoiding excessive PCR kinetic enrichment steps that often generate small DNA products.

In the CODE procedure, a combination of digestion with restriction enzymes, treatment with uracil-DNA glycosylase (UDG) and mung bean nuclease, PCR amplification and purification with streptavidin magnetic beads was used to isolate deleted sequences from the genomes of two human samples. CODE has proved to be a rather simple, efficient and robust procedure.

In the present invention two questions had to be answered:

-   (i) is it possible to use the CODE procedure for restriction enzymes containing CG in their recognition site; and
-   (ii) is it possible to use NotI clones for genome wide screening for deleted, amplified and methylated NotI sites.

If the CODE procedure works for enzymes cutting in CpG islands, then it would be possible to clone not just deleted sequences (probably deleted by chance and without any meaning), but also genes that can be regarded as candidate disease genes.

We suggest using only regions surrounding NotI sites for subtraction. The novelty of this approach is that these regions are enriched and purified using circularisation. We have designed special primers and a procedure to obtain the NotI representations (NR). The other principles for this subtraction were the same as in the CODE procedure, but genomic DNA was digested with BamHI+BglII and NotI, and other linkers were used to allow PCR amplification of only those fragments containing NotI sites. Other DNA fragments were not amplified. Only two cycles of subtraction were used here.

To validate this approach, we compared a lung tumour cell line, ACC-LC5, that contained a 0.7 Mb homozygously deleted region in 3p21-p22, with normal lymphocyte control DNA. We did not know if this cell line contained homozygous deletions in other chromosomes. This normal DNA is not a completely appropriate control because it was isolated from another individual. We therefore expected to clone polymorphic sequences as well as deleted ones.

An overview of the subtractive procedure is shown in FIG. 1. Tester and driver DNA was digested with BamHI+BglII and self-ligated at very low DNA concentration to form circles. Intermolecular ligation does not create any problems because the vast majority (99.99%) of these ligated molecules will not be PCR amplified in the further steps. Even the rare cases in which two ligated molecules contain closely located NotI sites and can be PCR amplified are useful, since they serve to normalize the representation of different NotI surrounding sequences. These circles were then digested with NotI. The majority (approximately 99.9%) of the circles will not be opened and thus will be omitted from further reactions. This also serves to decrease background hybridization due to illegitimate ligation of the NotI linker to DNA fragments with BamHI or BglII sticky ends.

The driver DNA was amplified with dUTP and unmodified primers, and the tester DNA was amplified with biotinylated primers in the presence of normal dNTPs. The products of DNA amplification (on average 0.5-1.5 kb) were denatured and hybridized at a ratio of 1:100 for the tester to driver DNA. After hybridization had been completed, the products were treated with UDG (which destroyed all the driver DNA) and mung bean nuclease (which digested single stranded DNA and all the non-perfect hybrids). The resulting tester homohybrids were purified, concentrated with streptavidin beads, and subjected to one more round of subtraction. The final PCR product was amplified and cloned in a suitable vector, e.g. the pBC KS(+) vector (Stratagene).

From our previous experiments we knew that the NLJ-003 and NL1-401 clones were deleted in this cell line. We isolated DNA from 10 random clones and sequenced them (Southern blotting with these small inserts was impossible due to the high CG content). In this experimental scheme only short DNA sequences (300-400 bp) were obtained, but their size can be increased using long distance PCR. Two of these clones contained the NLJ-003 NotI site.

This experiment demonstrated that subtraction using NotI surrounding sequences is very efficient, since only 2 sites out of 10.000 NotI sites were located in the homozygously deleted region and one of them was found after analysis of only 10 clones. The other clones may be polymorphic and/or hemizygously deleted, since when the CODE procedure was applied to the same driver/tester pair the majority of informative clones (11 of 19) fell into this category.
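The efficiency claim can be put into numbers by asking how likely such a result would be without any enrichment. The Python sketch below assumes, as a simplification not stated in the text, that each sequenced clone is an independent, uniform draw from all 10.000 NotI sites.

```python
def chance_of_hit(sites_in_region: int, total_sites: int, clones: int) -> float:
    """Probability that at least one of `clones` random picks falls in the region."""
    p_single = sites_in_region / total_sites
    return 1.0 - (1.0 - p_single) ** clones


if __name__ == "__main__":
    # 2 sites in the deleted region, 10.000 sites in total, 10 clones analysed.
    print(round(chance_of_hit(2, 10_000, 10), 4))  # 0.002
```

Under this assumption the chance of recovering a site from the deleted region among 10 random clones is about 0.2%, which illustrates the degree of enrichment achieved by the subtraction.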

Thus, the present invention demonstrates that the NotI-CODE procedure can be used for enzymes cutting in CpG islands.

Use of NR for NotI Clone Microarrays

Thereafter we decided to check whether NR, after labelling with ³²P, could be used directly for detection of deleted NotI sites. We therefore prepared nylon filters with immobilized DNA from NotI linking clones. These filters were hybridized to NR of ACC-LC5 (NR-A) and normal lymphocyte DNA (NR-B).

The results showed that these two NRs revealed different hybridization patterns: several clones hybridizing to NR-B did not hybridize to NR-A. First of all, it is clear that the homozygously deleted NLJ-003 and NL1-401 were easily detected. To understand why other clones failed to hybridize to NR-A, we selected 4 such clones and analysed them using Southern hybridization. Genomic DNA from ACC-LC5 and normal lymphocytes was digested either with BamHI+BglII or with BamHI+BglII+NotI, resolved by electrophoresis in an agarose gel, transferred to a nylon filter and hybridized to the ³²P labelled insert of a NotI linking clone (FIG. 2:1-4). This experiment demonstrated that all 4 clones exhibited clear presence of a NotI recognition site in DNA from normal lymphocytes and absence of the corresponding NotI site in ACC-LC5 DNA.

As a next step we performed a similar experiment but used microarrays of DNA from NotI linking clones immobilized on a glass slide. The main idea of this application is shown in FIG. 3. If a particular NotI site is present in the DNA, then the circle will be opened with NotI and labelled. However, if this NotI site is deleted or methylated, then NR will not contain the corresponding DNA sequences.

In a first experiment we used DNA isolated from the human-mouse microcell hybrid cell lines MCH903.1 (containing the whole human chromosome 3) and MCH939.2 (chr. 3 del p14-p22). NR for MCH903.1 was labelled red and NR for MCH939.2 was labelled green. Thus sequences deleted in MCH939.2 should be red. In this way the deletion was precisely mapped (FIG. 4A). Before the present invention, one year of work would have been needed to obtain the same results.

In a second experiment DNA from ACC-LC5 was again used to prepare NR-A and normal lymphocyte DNA was used for making NR-B. NR-A was labelled with Cy3 (green) and NR-B with Cy5 (red). If a sequence is present in both NRs, then the combined colour will be close to yellow, and if some clones are deleted in ACC-LC5, then the colour for these clones will be more red (FIG. 4B). As shown in FIG. 4, the homozygously deleted clones NLJ-003 and NL1-401 can be detected unambiguously. Other clones showing a redder colour most likely reflect the fact that a deletion of 3p is detected in practically 100% of SCLC cases. Some clones showed the same imbalance as NLJ-003 and NL1-401. This can be explained by methylation of both alleles, or by deletion of one allele of a NotI site and methylation (or polymorphism) of the other. Indeed, as shown in FIG. 2:3-4, clones NLM-132 and NR3-077 do not contain cleavable NotI sites. In two other cases (AP20 and NRL1-1) that were also completely red, the situation is different: one allele is methylated and the other is deleted (FIG. 2:5-6 and Table 5).

To further check the results of this hybridization, TaqMan probes were designed for 5 NotI linking clones. Quantitative real-time PCR was performed with these primers/probes using an ABI Prism® Model 7700 Sequence Detector. The results of the quantitative PCR corresponded well with the NotI microarray hybridization, see Table 5 below.

Contamination of tumor DNA with normal DNA represents a serious problem for the identification of tumor suppressor genes. Two RCC biopsies containing 30-40% contaminating normal cells were used in a control experiment to check the sensitivity of NotI microarrays to contamination. One step of the NotI-CODE procedure was used before hybridization, and the probe was labeled with only one dye. As shown in FIG. 4 (C, D), the hybridization clearly identified the two regions most frequently deleted in RCC, 3p21 telomeric (near NLJ-003) and 3p21 centromeric (near NRL1-1). Therefore, the impurity problem that can occur with tumor biopsies can be easily resolved with NotI microarrays.

EXAMPLES

Cell Lines and General Methods

In the present invention DNA isolated from a small cell lung carcinoma cell line, ACC-LC5, was used. This cell line contains a homozygous 685-kb deletion in 3p21.3-p22 and was used as a source for DNA A, the driver. DNA isolated from normal human lymphocytes was the control DNA (DNA B, tester).

Isolation of DNA, Southern transfer, hybridization, etc. were performed according to standard methods described in the literature. Construction of NotI linking libraries was carried out as described above.

A standard protocol was used to prepare nylon filter replicas of the gridded NotI linking clones. The nylon filters contained 100 mapped chromosome-specific NotI linking clones and 15 random unmapped human NotI linking clones. For hybridization to nylon filter replicas of the gridded NotI clones, NR probes were ³²P labeled by PCR.

Sequencing gels were run on ABI 310 automated sequencers (Perkin Elmer) according to the manufacturers' protocols.

Growth of bacteria, other microbiology procedures, isolation of DNA and sequencing were performed according to standard methods.

The Modified NotI—CODE Procedure

Two oligonucleotides, NotX: 5′-AAAAGAATGTCAGTGTGTCACGTATGGACGAATTCGC-3′ and NotY: 3′-AAACTTACAGTGTGTGTCACGTATGGCTGCTTAAGCGCCGG-5′, were used to create the NotI linker. Annealing was carried out in a final volume of 100 μl containing 20 μl of 100 μM NotX, 20 μl of 100 μM NotY, 10 μl of 10× M buffer (Boehringer Mannheim) and 50 μl of H₂O. The reaction mixture was boiled for 8 min and allowed to cool slowly to room temperature (r.t.).

Two micrograms of DNA from the ACC-LC5 cell line (DNA A) and from normal lymphocytes (DNA B), at a DNA concentration of 50 μg/ml, were digested with 20 U of BamHI and 20 U of BglII (Boehringer Mannheim) at 37° C. for 5 h, followed by heat-inactivation for 20 min at 65° C. Then 0.4 μg of the digested DNAs was circularized overnight with T4 DNA ligase (Boehringer Mannheim) in the appropriate buffer in 1 ml of reaction mixture.

The DNA was concentrated by precipitation in ethanol, partially filled in with, for example, Klenow fragment, and digested with 10 U of NotI at 37° C. for 3 h. Following digestion, NotI was heat inactivated and the DNAs were ligated overnight at room temperature in the presence of a 50 M excess of NotI linker.

PCR of the tester amplicon (DNA B with NotI linker) was performed in 100 μl of a solution containing 67 mM Tris-HCl, pH 9.1, 16.6 mM (NH₄)₂SO₄, 1.0 mM MgCl₂, 0.1% Tween 20, 200 μM dNTPs, 100 ng tester amplicon DNA, 400 nM of biotinylated primer NotX and 5 U of Taq polymerase.

PCR of the driver amplicon (DNA A with NotI linker) was performed in 20 tubes using the NotX primer and the following modified conditions: dUTP (300 μM) was used instead of dTTP, and 2.5 mM MgCl₂ was used rather than 1.0 mM MgCl₂. The PCR cycling conditions were 72° C. for 5 min, followed by 25 cycles of 95° C. for 1 min and 72° C. for 2.5 min, and a final extension period at 72° C. for 5 min. We call these PCR amplified tester and driver amplicons NotI representations (NR).

All PCR amplified DNA A samples were pooled (2000 μl) and mixed with 20 μl of PCR amplified DNA B (for subtraction we used a ratio of 1:100 of DNA B to DNA A). The pooled sample was concentrated by precipitation in ethanol, purified using a JETquick PCR Purification Spin Kit (GENOMED Inc.), and dissolved in 100 μl H₂O. This DNA mixture was further concentrated to 6 μl and boiled for 10 min under mineral oil.

Subtractive hybridization was performed for 40 h in 9 μl of buffer containing 0.4 M NaCl, 100 mM Tris-HCl, pH 8.5 and 1 mM EDTA. After hybridization, the mixture was diluted to 200 μl and extracted with an equal volume of chloroform:isoamyl alcohol (24:1) to remove the mineral oil.

Treatment with UDG (Boehringer Mannheim) was performed in a buffer containing 70 mM Hepes-KOH, pH 7.4, 1 mM EDTA and 1 mM dithiothreitol with 30 U UDG at 37° C. for 4 h. The DNA was then precipitated with ethanol and dissolved in 25 μl of TE buffer. To this, 3 μl of 10× MBN buffer (30 mM sodium acetate, pH 4.6, 50 mM NaCl, 1 mM zinc acetate and 0.001% Triton X-100) and 20 U of mung bean nuclease (Boehringer Mannheim) were added, and the mixture was incubated at 37° C. for 30 min. The reaction was stopped by the addition of EDTA to a final concentration of 1 mM.

The subtracted DNA was purified with streptavidin coupled Dynabeads M-280 (Dynal A. S, Oslo, Norway) according to the manufacturer's instructions and dissolved in 20 μl of TE buffer. Approximately 0.5 μl of this DNA preparation was PCR amplified as described above for DNA B, but using only 8 cycles, before subjecting the amplified DNA to a second round of hybridization.

The final subtraction product was PCR amplified, purified with the JETquick PCR Purification Spin Kit (GENOMED Inc.) and digested with NotI. This DNA preparation was inserted into the pBC KS(+) vector (Stratagene), which had been digested with NotI and dephosphorylated with alkaline phosphatase (Boehringer Mannheim).

Microarray Preparation, Hybridization and Scanning.

Microarrays were constructed essentially as described by Schena M. et al., 1996. In brief, DNA of NotI linking clones was spotted onto 3-aminopropyl-trimethoxysilane-coated glass microscope slides. The majority of NotI clones contained inserts of 2-12 kb (the vector part was 3.8 or 4.5 kb, see Zabarovsky et al., 1990). Qiagen-purified DNAs were dissolved in TE and arrayed using a GMS 417 Arrayer (Genetic MicroSystems, Woburn, Mass.) with the spot density at 375 μm. The arrays were subsequently air dried, submerged in 70% EtOH for 30 min at room temperature, air dried again, and stored in the dark at −20° C. The microarrays described here contained 150 sequence-validated human chromosome 3-specific STSs in six repetitions, representing 61 known and 49 unknown expressed sequence tags.

The NR probes were labelled in a PCR reaction with the NotX primer. Incorporation of digoxigenin or biotin was done using PCR DIG Labelling Mix (Boehringer Mannheim) or Biotin Reaction Mix (MICROMAX, NEN Life Science Products, Inc., Boston, Mass.). PCR products were purified using MicroSpin PCR Purification Columns (Saveen), and the efficiency of the labelling was determined by membrane-based chemiluminescence analysis (MICROMAX, NEN).

An alternative method for preparing NR from low quality DNA was also used. According to this method, genomic DNA was simultaneously digested with NotI and another enzyme or combination of enzymes not having CpG pairs in the recognition sites (e.g. Sau3A or BamHI+BglII).

After inactivation of the two enzymes, the specific adaptors Sau00N and NBSgt99 were ligated to the digested DNA:

Sau00N
5′-GATC CTC AAA CGC GT-3′-Amine
3′-GAG TTT GCG CAC AGC ACT GAC CCT TTT GGG ACC-5′

NBSgt99
5′-GGC CTC CAG AAA ACA TCC ACG GGC TCT AGG ATA GAT CGC-3′
3′-AG GTC TTT TGT AGG-5′

Thereafter, NR was prepared using PCR in the presence of the Zuniv and Zgt primers. The PCR cycling conditions were 95° C. for 2 min, followed by 25 cycles of 95° C. for 45 sec, 65° C. for 30 sec and 72° C. for 1.5 min. In general, these NRs showed the same results in hybridization experiments, but the background was usually higher.

Qualified Dig- and Bio-labelled probes were combined, denatured at 99° C. for 2 min, and hybridized with denatured (0.1 M NaOH, 2 min, r.t.) microarrays in the Hybridization Buffer (MICROMAX, NEN) for 5 h at 65° C.

The arrays were washed for 5 min at r.t. in low stringency buffer (0.06×SSC, 0.01% SDS) and developed using the TSA system (MICROMAX, NEN) according to the manufacturer's protocols. In brief, we incubated the microarrays with anti-DIG antibodies conjugated with horseradish peroxidase (Boehringer Mannheim) and then with Cyanine-3-Tyramide solution. After inactivation of the peroxidase in this first layer, Streptavidin-HRP Conjugate was applied and biotin residues were visualized with Cyanine-5-Tyramide.

The arrays were scanned using a GMS 418 Scanner (Genetic MicroSystems, Woburn, Mass.), and analyzed and represented with ImaGene 3.05 software (Biodiscovery). Accurate measurements of Cy3/Cy5 fluorescence ratios were obtained by taking the average of the ratios of all six spotted repetitions.
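The averaging step can be written out as follows. This Python sketch is not the ImaGene implementation; it only illustrates taking the mean of the per-spot Cy3/Cy5 ratios for the replicate spots of one clone, with hypothetical intensity values and no background correction.

```python
from statistics import mean


def mean_cy3_cy5_ratio(cy3: list[float], cy5: list[float]) -> float:
    """Average of the per-spot Cy3/Cy5 ratios over the replicate spots of one clone."""
    if len(cy3) != len(cy5) or not cy3:
        raise ValueError("need the same, non-zero number of Cy3 and Cy5 values")
    if any(value <= 0 for value in cy5):
        raise ValueError("Cy5 intensities must be positive")
    return mean(g / r for g, r in zip(cy3, cy5))


if __name__ == "__main__":
    # Six replicate spots of one clone (hypothetical intensities).
    cy3 = [100.0, 100.0, 100.0, 100.0, 100.0, 100.0]
    cy5 = [200.0, 200.0, 200.0, 200.0, 200.0, 200.0]
    print(mean_cy3_cy5_ratio(cy3, cy5))  # 0.5
```

Averaging the ratios spot by spot, rather than dividing the summed intensities, gives each replicate equal weight regardless of its absolute brightness.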

Quantitative Real-Time PCR with TaqMan Probes

Oligonucleotide primers and probes were designed to amplify 5 NotI linking clones: NRL1-1 (3p21.2), NL3-001 (3p21.2-21.32), NL1-205 (3p21.2-21.32), NLj3 (3p21.33) and 924-021 (3p12.3). The beta-actin gene (huBA) was used as the reference sequence (endogenous control). Final selection of primer and probe sequences, except for huBA, was performed using the ABI Primer Express Software Version 1.5 (PE-Applied Biosystems, Foster City, Calif., USA) according to the manufacturer's instructions. TaqMan probes and primers were obtained from Perkin-Elmer. A TaqMan probe consists of an oligonucleotide with a 5′-fluorescent reporter dye and a 3′-quencher dye. The NLj3, NRL1-1 and huBA probes contained FAM (6-carboxy-fluorescein), and the NL3-001, NL1-205 and 924-021R probes contained JOE (2,7-dimethoxy-4,5-dichloro-6-carboxy-fluorescein), as reporter dyes located at the 5′-ends. All reporters were quenched by TAMRA (6-carboxy-N,N,N′,N′-tetramethyl-rhodamine), conjugated to the 3′-terminal nucleotides. The resulting sequences are given below in Table 6.

PCR reactions were carried out in 25 μl volumes consisting of 1× PCR buffer A (10 mM Tris-HCl, 10 mM EDTA, 50 mM KCl, 60 nM passive reference A, pH 8.3 at room temperature), 3.5 mM MgCl₂, 200 μM dATP, dGTP, dCTP, 400 μM dUTP, 100 nM TaqMan probe, forward and reverse primers in appropriate concentrations, 0.025 unit/μl AmpliTaq Gold DNA polymerase, 0.01 unit/μl AmpErase and 5 μl of appropriately diluted DNA template. H₂O was added to a total volume of 25 μl. PCR was performed using an ABI Prism® Model 7700 Sequence Detector. The reactions were done in triplicate for each sample in the same or separate tubes.

The primer limitation experiments were performed for multiplex PCR with more than one primer pair in the same tube (ABI PRISM 7700 Sequence Detection System. User Bulletin no. 2. Relative quantitation of Gene Expression. PE Applied Biosystems, 1997). Thermal cycling conditions consisted of 2 min at 50° C. and 10 min at 95° C., followed by 40 cycles of 15 s at 95° C. and 1 min at 60° C.

Cycle threshold ($C_T$) determinations (i.e. calculations of the number of cycles required for reporter dye fluorescence resulting from the synthesis of PCR products to become significantly higher than background fluorescence levels) were automatically performed by the instrument for each reaction.

Details concerning the theory and derivation of the comparative $C_T$ method ($\Delta\Delta C_T$ method) for target sequence quantitative assessment have been published (ABI PRISM 7700 Sequence Detection System. User Bulletin no. 2. Relative quantitation of Gene Expression. PE Applied Biosystems, 1997). This method depends on the inverse exponential relationship that exists between the starting quantity (number) of target sequence copies in the reactions and the corresponding $C_T$ determinations by the ABI 7700 system: the more copies, the lower the $C_T$ value (ABI PRISM 7700 Sequence Detection System. User Bulletin no. 2. Relative quantitation of Gene Expression. PE Applied Biosystems, 1997). We used the comparative cycle threshold ($C_T$) method to determine the target sequence quantity in the tumour sample, ACC-LC5 (target), relative to that in the sample for comparison, normal DNA (calibrator), each compared with an endogenous control sequence, beta-actin (reference), in both samples. For amplicons designed and optimized according to PE Applied Biosystems guidelines, efficiency is close to 100%. In this case, the amount of target (copy number), normalized to an endogenous reference and relative to a calibrator, is given by:

$$\frac{N_{\text{ACC-LC5}}}{N_{\text{calibrator}}} = 2^{-\Delta\Delta C_T}$$

The calculation of $\Delta\Delta C_T$ involves subtraction of the mean reference sequence $C_T$ values from the mean target sequence $C_T$ values for ACC-LC5 and CBMI, to obtain

$$\Delta C_{T,\text{ACC-LC5}} = C_{T,\text{target}} - C_{T,\text{actin}}, \qquad \Delta C_{T,\text{norm}} = C_{T,\text{target}} - C_{T,\text{actin}},$$

where each difference is taken within the respective sample. The value $\Delta C_{T,\text{norm}}$ is then subtracted from $\Delta C_{T,\text{ACC-LC5}}$ to obtain $\Delta\Delta C_T$:

$$\Delta\Delta C_T = \Delta C_{T,\text{ACC-LC5}} - \Delta C_{T,\text{norm}}$$

The range given for all probes relative to β-actin was determined by evaluating the expression $2^{-\Delta\Delta C_T}$ with $\Delta\Delta C_T + s$ and $\Delta\Delta C_T - s$, where $s$ is the standard deviation of the $\Delta\Delta C_T$ value.

For the $\Delta\Delta C_T$ calculation to be valid, the efficiency of the target amplification and the efficiency of the reference amplification must be approximately equal. Before using the $\Delta\Delta C_T$ method for quantitative assessment, a validation experiment was performed (ABI PRISM 7700 Sequence Detection System. User Bulletin no. 2. Relative quantitation of Gene Expression. PE Applied Biosystems, 1997). The validation experiments demonstrated that the efficiencies of these targets and references are approximately equal for the chosen dilutions. In this case we can use the $\Delta\Delta C_T$ calculations for the relative quantitation of target without using standard curves.
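The comparative threshold cycle calculation described above can be sketched in a few lines of Python. The sketch assumes amplification efficiencies close to 100% for both target and reference, as required above; the function and argument names are illustrative, and the numerical values in the example are hypothetical rather than measured data.

```python
def relative_copy_number(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_calibrator: float,
    ct_reference_calibrator: float,
) -> float:
    """Relative target amount, 2 ** (-ddCt), from mean threshold cycle values."""
    dct_sample = ct_target_sample - ct_reference_sample
    dct_calibrator = ct_target_calibrator - ct_reference_calibrator
    ddct = dct_sample - dct_calibrator
    return 2.0 ** (-ddct)


def copy_number_range(ddct: float, s: float) -> tuple[float, float]:
    """Lower and upper bound of 2 ** (-ddCt) for ddCt + s and ddCt - s."""
    return 2.0 ** (-(ddct + s)), 2.0 ** (-(ddct - s))


if __name__ == "__main__":
    # Hypothetical mean Ct values: the target appears one cycle later in the
    # sample than in the calibrator after normalization to the reference.
    print(relative_copy_number(26.0, 20.0, 25.0, 20.0))  # 0.5
    print(copy_number_range(1.0, 1.0))                   # (0.25, 1.0)
```

A value of 0.5 corresponds to half the copy number of the calibrator, which is what would be expected for loss of one of two alleles.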

Data analysis was done using Sequence Detection System (SDS) software(PE-Biosystems).

The NotI-Passporting Procedure

Two oligonucleotides, BfocII: 5′-ggatgaaaactgga-3′ and Z98NOT: 3′-gtcgtgactgggaaaaccctggcctacttttgacctccgg-5′, were used to create the NotI linker.

Two micrograms of bacterial DNA at a concentration of 50 μg/ml were digested with 20 U NotI (Roche Molecular Biochemicals) at 37° C. for 2 h and heat-inactivated for 20 min at 85° C. Then, 0.4 μg of the digested DNA was ligated to the NotI linker (50 M excess) overnight with T4 DNA ligase (Roche Molecular Biochemicals) in the appropriate buffer in 100-μl reaction mixtures. The DNA was then concentrated by precipitation in ethanol and digested with 10 U BpmI at 37° C. for 3 h.

Following digestion, BpmI was heat-inactivated and the DNA was ligated overnight in the presence of a 50 M excess of the ZNBpm linker at room temperature. Two oligonucleotides, Zamine: 5′-ctcaaaccgt-3′ and Z2_univer: 3′-Nngagtttggcacagcactgacccttttgggacc-5′, were used to create the ZNBpm linker.

The sample was then purified using a JETquick PCR Purification Spin Kit(GENOMED Inc.), and dissolved in 100 μl TE. One microliter of thissample was PCR amplified with Z1 univer(3′-gagtttggcacagcactgacccttttgggacc-5′) and antiuniver(5′-cagcactgacccttttgggacc-3′) primers.

PCR was performed in 40 μl solution containing 67 mM Tris-HCl (pH 9.1),16.6 mM (NH₄)₂SO₄, 2.0 mM MgCl₂, 0.1% Tween 20, 200 μM dNTPs, 3 μl PCRpool, 400 nM of each primer, and 5 U Taq DNA polymerase. The PCR cyclingconditions were 95° C. for 1.5 min, followed by 25 cycles of 95° C. for1 min, 60° C. for 1 min, with 72° C. for 0.5 min, with a final extensionperiod at 72° C. for 3 min.

The final product was purified with the JETquick PCR Purification Spin Kit (Genomed GmbH) and cloned using the TOPO TA Cloning kit (Invitrogen AB, Sweden). Sequencing gels were run on ABI 377 automated sequencers (Perkin Elmer), according to the manufacturers' protocols, using standard primers.

For the analysis of the composition of complex flora, we suggest using only certain specific fragments of the genomes (e.g. NotI representations, NotI tags, NotI linking clones, etc.). Thus we do not aim to sequence all genomes or study all genes. We assign special signatures to the particular microorganisms/genes and analyze these signatures in different samples of colon flora. In the present work we have analyzed the use of short sequence tags attached to NotI or other restriction enzyme recognition sites. The collection of NotI tags represents a NotI sequence passport, or in short a NotI passport, and NotI passporting means the creation of NotI tags/passports. The naming is based on the initially used enzymes, but the methods can be adapted to other restriction enzymes as well.
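The idea of a NotI passport can be illustrated in silico with a minimal sketch that collects the short sequences flanking every NotI site in a known sequence. This is only an illustration of the concept, not the laboratory procedure: the function name, the `flank` parameter and the toy sequence are hypothetical, and the NotI recognition sequence GCGGCCGC is the only outside fact used.

```python
import re

NOTI_SITE = "GCGGCCGC"  # palindromic, so scanning one strand is sufficient


def noti_tags(genome, flank=16):
    """Return (position, left_flank, right_flank) for every NotI site.

    `flank` is an illustrative tag length, not a value from the protocol.
    A lookahead pattern is used so that overlapping sites are not missed.
    """
    genome = genome.upper()
    tags = []
    for match in re.finditer("(?=" + NOTI_SITE + ")", genome):
        start = match.start()
        end = start + len(NOTI_SITE)
        left = genome[max(0, start - flank):start]
        right = genome[end:end + flank]
        tags.append((start, left, right))
    return tags


# Toy sequence with a single NotI site.
print(noti_tags("aaaattttgcggccgcccccgggg", flank=4))
# [(8, 'TTTT', 'CCCC')]
```

The list of flanking sequences obtained for one genome plays the role of its passport; a linear sequence is assumed, so sites spanning the origin of a circular genome would need separate handling.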

The general design of the experiment is as follows (FIG. 5). DNA obtained from faecal samples and surgical specimens is digested with NotI and ligated to a special linker containing a BpmI recognition site. The DNA is then digested with BpmI, ligated to the special linkers and PCR amplified. We have shown that under these conditions only specific 85 bp NotI-BpmI fragments are amplified (FIG. 6). After digestion with BpmI and FokI, this fragment generates 24 bp fragments which represent particular NotI sites. From here it is possible to work in two directions.

a) Concatemer Strategy

The 24 bp units will be ligated into concatemers of about 1,000 bp in size, cloned and sequenced. Each sequencing reaction will give information about 20-50 NotI sites.
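The way a sequenced concatemer could be read back into individual tags can be sketched as follows. This is a simplified illustration which assumes that the read consists of uninterrupted 24 bp units in a single orientation; real reads may contain vector sequence, partial units or units in either orientation. The function name is hypothetical.

```python
def split_concatemer(read, unit=24):
    """Cut a concatemer read into consecutive complete units of `unit` bp."""
    read = read.strip().upper()
    return [read[i:i + unit] for i in range(0, len(read) - unit + 1, unit)]


# A toy read: two complete 24 bp units followed by an incomplete one.
toy_read = "a" * 24 + "c" * 24 + "g" * 10
print(len(split_concatemer(toy_read)))  # 2

# Upper bound on complete 24 bp units in a 1,000 bp concatemer.
print(1000 // 24)  # 41
```

The upper bound of 41 units per 1,000 bp is in line with the 20-50 NotI sites per sequencing reaction given above.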

b) Oligomer Strategy

New high-throughput sequencing techniques, such as pyrosequencing or massively parallel signature sequencing, have been developed recently. They allow one person to produce many thousands of sequences per day. These sequences are very short (20-40 bp), but this suits our needs well, since a NotI passport for the particular specimen can be produced from them. By comparing these passports, e.g. from different individuals or from the same individual before and after drug treatment, we find the differences between them. In some cases this information can be used directly to draw conclusions. In other cases, these sequences can be used to identify NotI linking clones which differ between two samples. These clones can be used for further analysis, e.g. finding the genes which are responsible for a certain medical condition (e.g. cancer, aging etc.) or sequencing/isolation of the required microorganism.

TABLE 1 Comparison of different microarrays to study genome copy number changes and methylation.*

Method/Feature        cDNA        CGH (BACs, P1, PACs)  Representation  SNP            CGI            RST (NotI microarrays)
Homozygous deletions  Low         Yes                   Yes/No          No             Yes/No         Yes
Hemizygous deletions  Low         Yes                   No              No             No             Yes
LOH                   No          No                    Yes             Yes            Yes/No         Yes
Amplification         Low/Medium  Yes                   Yes             No             Yes            Yes
Methylation           No          No                    No              No             No             Yes
Connection to genes   Direct      Indirect              No (indirect)   No (indirect)  No (indirect)  Direct

Number of available markers
cDNA: more than 40,000
CGH (BACs, P1, PACs): 10,000-30,000
Representation: 1,500 (polymorphic BglII fragments per genome)
SNP: 1,300 (can be increased)
CGI: 1,500 (polymorphic BglII fragments per genome)
RST (NotI microarrays): 10,000-20,000

Main disadvantages
cDNA: low sensitivity and precision; small size of the insert; high background, several EST markers
CGH (BACs, P1, PACs): very expensive, difficulties to work with large-insert vectors: low yield, rearrangements; 100% DNA is labeled (many repeats)
Representation: not convenient for large-scale screening; small size of the inserts; only 1% polymorphic sites; unknown location; 2.5% of all DNA is labeled, unknown purification from repeats
SNP: high sensitivity to normal cell contamination, short hybridizing sequences; many reactions and primers are needed; expensive; less than 30% of markers are polymorphic
CGI: not convenient for large-scale screening; small size of the inserts; only 1% polymorphic sites; unknown location; 2.5% of all DNA is labeled, unknown purification from repeats
RST (NotI microarrays): high CG content; small size of the inserts; discrimination between LOH, deletions/amplifications and methylation should be used to determine copy number; 100% DNA is labeled (many repeats)

Main advantages
cDNA: direct connection to RNA profiling
CGH (BACs, P1, PACs): good to check copy number changes
Representation: can be easily adopted for small scale experiments (small genomes)
SNP: very small fraction of the genome is labeled; good to check LOH
CGI: can be easily adopted for small scale experiments (small genomes)
RST (NotI microarrays): methylation detection; up to 45% of clones are polymorphic, easy to solve normal cell contamination, one reaction and one pair of primers are used; comparatively cheap, good to check LOH and copy number changes, only 0.1-0.2% of the genome is labeled; 10 fold purification from repeats; direct connection to genes; simultaneous detection of different aberrations associated with cancer development

*Efficiency of the method to detect the particular feature
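The comparison of passports described for the oligomer strategy, e.g. for the same individual before and after drug treatment, can be sketched as a simple set comparison. The function name and the tag sequences below are hypothetical placeholders; the sketch is qualitative only and ignores tag abundance.

```python
def compare_passports(passport_a, passport_b):
    """Compare two collections of sequence tags (presence/absence only)."""
    tags_a, tags_b = set(passport_a), set(passport_b)
    return {
        "only_in_a": sorted(tags_a - tags_b),
        "only_in_b": sorted(tags_b - tags_a),
        "shared": sorted(tags_a & tags_b),
    }


# Hypothetical tags, shortened for readability.
before = ["ACGTACGT", "GGGTTTAA", "CCCCAAAA"]
after = ["ACGTACGT", "TTTTGGGG"]
result = compare_passports(before, after)
print(result["only_in_a"])  # ['CCCCAAAA', 'GGGTTTAA']
print(result["only_in_b"])  # ['TTTTGGGG']
```

Tags found in only one of the two passports are the candidates for identifying the corresponding NotI linking clones; a quantitative comparison would count tag occurrences instead of using sets.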

TABLE 2 Comparison of NotI and CGI microarrays

Feature                                 NotI microarrays                             CGI-microarrays
Incomplete restriction digestion        No effect                                    Artificial result
Specificity of labeling                 0.1-0.5% of the total human DNA              100% total human DNA
Repeats                                 10% compared to the average in human genome  Approximately the same as in average
rRNA genes                              No                                           Yes
Homozygous deletions                    Yes                                          No
Hemizygous deletions                    Yes                                          No
Hemizygous methylation                  Yes                                          No
Oligo microarrays                       Yes                                          ???
Homozygous methylation in cancer cells  Yes                                          Yes
Quality of clones                       All sequenced, all contain genes             Partly sequenced, many repeated sequences and repeats like LINE etc.
Number of available clones              >5,000                                       Unknown

TABLE 3 The number of recognition sites for rare cutting restriction enzymes in selected bacterial genomes

No.  GENOME                            SIZE (Mb)  NotI  PacI  PmeI  SbfI  SgfI  SgrAI  Sse232I  SwaI
1    Bacillus subtilis                 4.2        81    89    89    51    51    157    52       176
2    Borrelia burgdorferi              1.5        1     234   37    8     0     2      0        548
3    Campylobacter jejuni              1.6        0     91    42    13    5     1      0        526
4    Chlamydophila pneumoniae AR39     1.2        2     59    10    21    13    4      1        60
5    Deinococcus radiodurans R1        3.3        15    1     4     28    7     645    164      1
6    Escherichia coli K12              4.6        23    143   87    68    222   548    31       117
7    Escherichia coli O157:H7          5.5        36    165   92    108   239   642    34       126
8    Helicobacter pylori 26695         1.7        7     32    35    4     88    61     12       67
9    Helicobacter pylori J99           1.6        14    34    43    4     87    66     15       76
10   Lactococcus lactis subsp. lactis  2.4        3     176   47    17    2     11     0        235
11   Rickettsia prowazekii             1.1        1     239   20    10    1     4      0        229
12   Staphylococcus aureus Mu50        2.9        0     440   83    12    5     12     2        602
13   Streptococcus pneumoniae R6       2.0        1     40    25    30    1     9      0        51
14   Synechocystis PCC6803             3.6        44    192   104   40    3160  182    18       167
15   Vibrio cholerae                   4.0        73    103   117   37    203   199    24       104
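Counts of the kind listed in TABLE 3 can in principle be obtained by scanning a genome sequence for each recognition sequence. The sketch below is an illustration under stated assumptions, not the procedure used to produce the table: the recognition sequences are the commonly documented ones for these enzymes and are not given in this document, the genome is treated as a linear string, and the toy sequence is invented.

```python
import re

# Commonly documented recognition sequences (assumption: not from this text).
# R = A or G, Y = C or T. All eight sites are palindromic, so one strand suffices.
ENZYMES = {
    "NotI": "GCGGCCGC", "PacI": "TTAATTAA", "PmeI": "GTTTAAAC",
    "SbfI": "CCTGCAGG", "SgfI": "GCGATCGC", "SgrAI": "CRCCGGYG",
    "Sse232I": "CGCCGGCG", "SwaI": "ATTTAAAT",
}
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def count_sites(genome, site):
    """Count (possibly overlapping) occurrences of a recognition sequence."""
    pattern = "(?=" + "".join(IUPAC[base] for base in site) + ")"
    return len(re.findall(pattern, genome.upper()))


# Toy sequence: one NotI site, one CACCGGTG site and one CGCCGGCG site.
toy = "GCGGCCGC" + "AAAA" + "CACCGGTG" + "TT" + "CGCCGGCG"
counts = {name: count_sites(toy, site) for name, site in ENZYMES.items()}
print(counts["NotI"], counts["SgrAI"], counts["Sse232I"])  # 1 2 1
```

Because every CGCCGGCG site also matches CRCCGGYG, the Sse232I count for a genome can never exceed its SgrAI count, which the toy example reproduces.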

TABLE 4 Specificity of restriction tags in E. coli and H. pylori strains.

                                      Cutting sites in the genome   Unique for the species        Unique for the strain
Species              Strain           PmeI  SbfI  NotI  Total       PmeI  SbfI  NotI  Total       PmeI  SbfI  NotI  Total
Escherichia coli     K12 (4.6 Mb)     87    68    23    178         74    61    20    155         25    26    6     57
Escherichia coli     O157H7 (5.5 Mb)  92    108   36    236         77    90    34    201         28    55    20    103
Helicobacter pylori  26695 (1.7 Mb)   35    4     7     46          35    4     7     46          21    2     4     27
Helicobacter pylori  J99 (1.6 Mb)     43    4     14    61          43    4     14    61          33    2     11    46

TABLE 5 Relative quantitative measurements using comparative (ΔΔC_T) method for normal lymphocyte DNA and ACC-LC5 cell line

Target/colour   Location      N_ACC-LC5/N_norm = 2^(−ΔΔC_T)  Comments
924-021/yellow  3p12.3        0.94 (0.83-1.05)               No changes
NRL1-1/red      3p21.2        0.51 (0.41-0.62)               Initial target sequence copy number in ACC-LC5 is half of what is obtained in CBMI (hemizygous deletion)
NL3-001/yellow  3p21.2-21.32  1.12 (0.98-1.26)               No changes
NL1-205/yellow  3p21.2-21.32  1.25 (0.75-1.74)               No changes
NLj3/red        3p21.33       0.00                           Zero sequence copy number (homozygous deletion)

TABLE 6 TaqMan probe, primer sequences and product lengths

Target                        Oligonucleotide  Sequence (5′ → 3′)         Amplicon, bp
924-021 (3p12.3)              924-021, probe   TGCTGGCCACAGGCCCTGC        52
                              primer(F)        TGCATGTGCCAGTGTTGATAAA
                              primer(R)        GTGTTGTGAGCCCTGGGAA
NRL1-1 (3p21.2)               NRL1-1, probe    AGCCTGAGCTGGGCAGACAGTTTCC  74
                              primer(F)        CAGCCCCACGGTCACTTC
                              primer(R)        GCCAAAACAGACCCAGCCT
NL3-001 (3p21.2-21.32)        NL3-001, probe   CCCCAGAAACGCGCGGGC         60
                              primer(F)        CTTGCCATCTGCAATTCCCT
                              primer(R)        CTCCATGAGGCTGTGGGAAG
NL1-205 (3p21.2-21.32)        NL1-205, probe   GCGGCTGGCTCTGCGC           63
                              primer(F)        ATGAGGCTCTTTCCCATGCC
                              primer(R)        GCCGGATTCAGGATGCTTT
NLj3 (3p21.33)                NLj3, probe      CTGGCGGAGAGACTGGGAGCGA     125
                              primer(F)        CAGAGTGCGTGTGCCGACT
                              primer(R)        ACAACTTCTCTGCGGGCGT
huβA (control, chromosome 7)  huβA, probe      ATGCCCCCCCCATGCCATCCTGCGT  295
                              primer(F)        TCACCCACACTGTGCCCATCTACGA
                              primer(R)        CAGCGGAACCGCTCATTGCCAATGG

REFERENCES

-   Bicknell, D. C., Markie, D., Spurr, N. K., Bodmer, W. F., 1991. The human chromosome content in human x rodent somatic cell hybrids analyzed by a screening technique using Alu PCR. Genomics 10, 186-192.
-   Brookes, A. J., 1999. The essence of SNPs. Gene 234, 177-186.
-   Costello J F, Fruhwald M C, Smiraglia D J, Rush L J, Robertson G P, Gao X, Wright F A, Feramisco J D, Peltomaki P, Lang J C, Schuller D E, Yu L, Bloomfield C D, Caligiuri M A, Yates A, Nishikawa R, Su Huang H, Petrelli N J, Zhang X, O'Dorisio M S, Held W A, Cavenee W K, Plass C. Aberrant CpG-island methylation has non-random and tumour-type-specific patterns. Nat Genet 2000 24(2): 132-138.
-   Espinosa-Urgel, M., Kolter, R., 1998. Escherichia coli genes expressed preferentially in an aquatic environment. Mol. Microbiol. 28, 325-332.
-   Ishikawa, S., Kai, M., Tamari, M., Takei, Y., Takeuchi, K., Bandou, H., Yamane, Y., Ogawa, M., Nakamura, Y., 1997. Sequence analysis of a 685-kb genomic region on chromosome 3p22-p21.3 that is homozygously deleted in a lung carcinoma cell line. DNA Res. 4, 35-43.
-   Kaiser, C., Von Stein, O., Laux, G., Hoffmann M., 1999. Functional genomics in cancer research: identification of target genes of the Epstein-Barr virus nuclear antigen 2 by subtractive cDNA cloning and high-throughput differential screening using high-density agarose gels. Electrophoresis 20, 261-268.
-   Li, J., Wang, F., Zabarovska, V., Wahlestedt, C., Zabarovsky, E. R., 1999. Cloning of polymorphisms (COP): enrichment of polymorphic sequences from complex genomes. Nucleic Acids Res., in press.
-   Lisitsyn, N., Lisitsyn, N., Wigler, M., 1993. Cloning the differences between two complex genomes. Science 259, 946-951.
-   Lisitsyn, N. A., Segre, J. A., Kusumi, K., Lisitsyn, N. M., Nadeau, J. H., Frankel, W. N., Wigler, M. H., Lander, E. S., 1994. Direct isolation of polymorphic markers linked to a trait by genetically directed representational difference analysis. Nat. Genet. 6: 57-63.
-   Lisitsyn, N. A., Lisitsina, N. M., Dalbagni, G., Barker, P., Sanchez, C. A., Gnarra, J., Linehan, W. M., Reid, B. J., Wigler, M. H., 1995. Comparative genomic analysis of tumours: detection of DNA losses and amplification. Proc. Natl. Acad. Sci. USA 92: 151-155.
-   Parikh, V. S., Morgan, M. M., Scott, R., Clements, L. S., Butow, R. A., 1987. The mitochondrial genotype can influence nuclear gene expression in yeast. Science 235: 576-580.
-   Rosenberg, M., Przybylska, M., Straus, D., 1994. ‘RFLP subtraction’: a method for making libraries of polymorphic markers. Proc. Natl. Acad. Sci. USA, 91: 6113-6117.
-   Sambrook, J., Fritsch, E. F., Maniatis, T., 1989. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
-   Sugai, M., Kondo, S., Shimizu, A., Honjo, T., 1998. Isolation of differentially expressed genes upon immunoglobulin class switching by a subtractive hybridization method using uracil DNA glycosylase. Nucleic Acids Res. 26, 911-918.
-   Sugimura T, Ushijima T. Genetic and epigenetic alterations in carcinogenesis. Mutat Res (2000) 462(2-3): 235-246.
-   Yamakawa, K., Takahashi, T., Horio, Y., Murata, Y., Takahashi, E., Hibi, K., Yokoyama, S., Ueda, R., Takahashi, T., Nakamura, Y., 1993. Frequent homozygous deletions in lung cancer cell lines detected by a DNA marker located at 3p21.3-p22. Oncogene 8, 327-330.
-   Yan P S, Chen C M, Shi H, Rahmatpanah F, Wei S H, Caldwell C W, Huang T H. Dissecting complex epigenetic alterations in breast cancer using CpG island Microarrays. Cancer Res 2001 61(23): 8375-8380.
-   Zabarovsky, E. R., Boldog, F., Thompson, T., Scanlon, D., Winberg, G., Marcsek, Z., Erlandsson, R., Stanbridge, E. J., Klein, G., Sumegi, J., 1990. Construction of a human chromosome 3 specific NotI linking library using a novel cloning procedure. Nucleic Acids Res. 18, 6319-6324.
-   Zabarovska, V., Li, J., Muravenko, O., Fedorova, L., Ernberg, I., Wahlestedt, C., Klein, G. and Zabarovsky, E. R. CIS—cloning of identical sequences between two complex genomes. Chromosome Research, in press.
-   Allikmets et al.: NotI linking clones as tools to join physical and genetic mapping of the human genome. Genomics, (1994) 19: 303-309.
-   Kashuba et al.: Analysis of NotI linking clones isolated from human chromosome 3 specific libraries. Gene, 1999, 239: 259-271.
-   Katouli et al. Composition and diversity of intestinal coliform flora influence bacterial translocation in rats after hemorrhagic stress. Infection and Immunity 62: 4768-4774, 1994.
-   Klein, J. (1999). Batrachomyomachia: frog 1, mice 0. Scand J Immunol, 49: 11-13.
-   Li J. et al.: COP—a new procedure for cloning single nucleotide polymorphisms. Nucleic Acids Res. (2000) 28, e1, p. i-v.
-   Li et al.: CODE: a new genomic subtraction method for cloning deleted sequences. Biotechniques, (2001), 31: 788-793.
-   Midtvedt, T. (1999). In L. A. Hansson and R. H. Yolken (eds.), Microbial functional Activities. Lippincott-Raven, Philadelphia, Vol. Nestle Nutritional Workshop Series, pp. 79-96.
-   Ronaghi et al. Pyrosequencing—a DNA sequencing method based on real-time pyrophosphate detection. Science (1998) 281: 363-365.
-   Sandberg et al.: Capturing whole-genome characteristics in short sequences using a naïve Bayesian classifier. Genome Res. 2001, 11: 1404-1409.
-   Velculescu et al.: Serial analysis of gene expression. Science (1995) 270: 484-487.
-   Zabarovska et al.: Slalom libraries: a new approach to genome mapping and sequencing. Nucleic Acids Res. (2002) 30 (e6): 1-8.
-   Zabarovsky et al.: Construction of a human chromosome 3 specific NotI linking library using a novel cloning procedure. Nucl. Acids Res., 1990, 18: 6319.
-   Zabarovsky et al.: A new strategy for mapping the human genome based on a novel procedure for constructing jumping libraries. Genomics, 1991, 11: 1030-1039.
-   Zabarovsky et al.: NotI clones in the analysis of human genome. Nucl. Acids Res., (2000) 28: 1635-1639.
-   Zabarovsky et al.: Novel techniques to identify the species composition of complex microbial systems: restriction site tagged microarrays (RST) and NotI signatures. Microecology and Therapy, in press.

1. Method for preparing nucleic acid and/or modified nucleic acid reference material bound to a solid phase, comprising steps of: digesting nucleic acid and/or modified nucleic acid reference material using biochemical and/or chemical approaches, to obtain sequence fragments surrounding a specific restriction enzyme recognition site, and selecting said nucleic acid and/or modified nucleic acid sequence fragments flanking a specific restriction enzyme recognition site.

2. Method according to claim 1, wherein said reference material is digested by a first restriction enzyme and/or one or more second restriction enzymes.

3. Method according to claim 2, wherein the restriction enzymes are endonucleases.

4. Method according to claim 3, wherein the recognition sites of the first endonuclease are scarcely distributed along said genomic material and are located adjacent to gene sequences, and the recognition sites of said one or more second restriction endonucleases are more frequently occurring along said genomic material than the sites of the first endonuclease.

5. Method of claim 4, wherein the digestions by the first and second restriction endonucleases are performed simultaneously, and different linkers are ligated to the ends resulting from cutting by the first and second restriction endonucleases, respectively, which linkers are designed such that when primers are added in order to make PCR reactions, only the fragments containing ends resulting from cutting by the first restriction endonuclease will be amplified.

6. Method of claim 4, wherein the reference material is first digested by the one or more second restriction endonucleases, the ends of the thus obtained fragments are self-ligated into the form of circular nucleic acid and/or modified nucleic acid molecules, and any linear fragments remaining after self-ligation are inactivated before digestion with the first restriction endonuclease, whereby the linear fragments resulting from the digestion by the first endonuclease are subjected to PCR amplification.

7. Method of claim 2, wherein the first restriction endonuclease is NotI, or any other restriction endonuclease, the restriction sites of which occur in proximity to CpG islands in the genomic material.

8. Method of claim 2, wherein the first restriction endonuclease is NotI, PmeI or SbfI, or a combination of two or more of said endonucleases, and the second endonuclease is BamHI, BclI, BglII or Sau3A, or a combination of two or more of said endonucleases.

9. Method according to claim 1, wherein said nucleic acid and/or modified nucleic acid reference material is selected from RNA, DNA, peptides or modified oligonucleotides, or a combination of two or more of said materials.

10. Method according to claim 1, wherein the solid phase is a glass slide, coded beads, cellulose, such as nitrocellulose, or filters.

11. Method of claim 1, wherein the genomic material is derived from one or more humans, from different locations in the body/bodies and at the same or different points in time.

12. Method of claim 1, wherein the genomic material is derived from bacteria from the gut, skin or other parts of the human body.

13. Method of claim 1, wherein the genomic material is derived from any organism, bacteria, animal, or plant, or product produced therefrom, or from any substance wherein genomic material can be contained, especially air and water.

14. Use of representation of the genome, or of a part thereof, of an organism, comprising multiple copies of the nucleic acid and/or modified nucleic acid fragments, or a selection thereof, obtained by means of the method of claim 1 in discriminating between different genomes, detecting methylations, deletions, mutations and other changes within genomic material obtained from the same individual at different points of time, or in the genomic material obtained from one individual as compared to a standard representation obtained from at least one other individual, or a combination thereof.

15. Use of the representation according to claim 14, wherein the representation in liquid form is hybridized to the nucleic acid and/or modified nucleic acid fragments present in the form of said solid phases.

16. Use of the representation of the genome, or of a part thereof, of an organism, comprising multiple copies of the nucleic acid and/or modified nucleic acid fragments, or a selection thereof, obtained by means of the method of claim 1 for:
studying methylation and copy number changes in eukaryotic genomes for diagnosis, prognosis, identification of cancer causing genes, etc,
genotyping different microorganisms (viruses, prokaryotic, eukaryotic),
studying biocomplexity and diversity of complex biological systems, i.e. human gut, bacterial flora in water, food, air resources,
identifying pathogenic organisms in different sources including complex biological mixtures,
producing passports (images of microarrays hybridizations, database containing tag sequences) for different purposes: to describe organisms at different conditions, i.e. different ages, disease/healthy, infected/uninfected etc,
identifying new organisms, e.g. bacterial species,
producing microarrays (DNA- and oligo-based) to study all above described features,
verification and maintenance of large biological collections/banks, i.e. verifying cell lines and individual organisms for higher organisms and confirming the purity of the particular strain for microbial species,
producing kits for labeling and hybridization with microarrays,
producing kits for making sequence tagging (passporting), and
producing oligo microarrays to analyze sequence tags.

17. Use of the representation according to claim 16, wherein the representation in liquid form is hybridized to the nucleic acid and/or modified nucleic acid fragments present in the form of said solid phases.

18. NotI genomic subtraction method for cloning deleted sequences (CODE-genomic subtraction method) based on the use of fragments obtained by the method for preparing nucleic acid and/or modified nucleic acid reference material bound to a solid phase, comprising the steps of digesting nucleic acid and/or modified nucleic acid reference material using biochemical and/or chemical approaches, to obtain sequence fragments surrounding a specific restriction enzyme recognition site, and selecting said nucleic acid and/or modified nucleic acid sequence fragments flanking a specific restriction enzyme recognition site.

19-20. (canceled)